
s, which may be differential, were missed (hence the Type II error) due to the high extreme outliers occurring in the cancer replicates.

. A high extreme outlier in gene 216987_at in GDS3139 missed a true DEG.

COPA

The cancer outlier profile analysis (COPA) algorithm was perhaps the first one for discovering heterogeneous DEGs [Tomlins, et al., 2005].

It modified the t statistic into a ratio. The numerator is the distance between the rth (value was the 9th) percentile of the case expressions and the median of all expressions. The denominator is the median absolute distance, that is, the median absolute deviation from the whole population median of the gene. The COPA t statistic is defined as below,

\[
t_{COPA} = \frac{q(\mathbf{y}) - \lambda}{\sigma} \qquad (6.10)
\]

where σ = 1.4826 × median{|x − λ|, |y − λ|}, x stands for a vector of the control expressions, y stands for a vector of case expressions, q(y) stands for the rth percentile of y, and λ is the median of both x and y. The rth percentile is chosen without regard to the number of outliers. When the outlier number is small, the difference between q(y) and λ is small. However, when the outlier number is large, the difference between q(y) and λ is large. The rth percentile also depends on the distribution of the data set. The COPA p values are calculated by permutation.
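As an illustration only, the statistic in (6.10) and a permutation p value for a single gene might be sketched as follows. The function names `copa_statistic` and `copa_permutation_pvalue`, the NumPy dependency, the one-sided comparison, the add-one correction and the number of permutations are all assumptions made for this sketch; the text does not specify how the permutation is carried out. The percentile r is left as a required argument because no default is assumed here.

```python
import numpy as np


def copa_statistic(x, y, r):
    """Return (q(y) - lambda) / sigma as in Eq. (6.10) for one gene.

    x: control expressions, y: case expressions, r: percentile in [0, 100].
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    pooled = np.concatenate([x, y])
    lam = np.median(pooled)  # median of both x and y
    # Scaled median absolute deviation from the pooled median.
    sigma = 1.4826 * np.median(np.abs(pooled - lam))
    q = np.percentile(y, r)  # rth percentile of the case expressions
    # Note: sigma is 0 when more than half of the pooled values are equal;
    # NumPy then returns inf or nan with a warning instead of raising.
    return (q - lam) / sigma


def copa_permutation_pvalue(x, y, r, n_perm=1000, seed=0):
    """One-sided permutation p value (assumed scheme: shuffle group labels)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = copa_statistic(x, y, r)
    pooled = np.concatenate([x, y])
    n_x = len(x)
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)  # random reassignment to the two groups
        if copa_statistic(perm[:n_x], perm[n_x:], r) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    controls = [1, 2, 3, 4, 5]
    cases = [2, 3, 4, 5, 100]  # one high extreme outlier among the cases
    print(copa_statistic(controls, cases, r=50))
    print(copa_permutation_pvalue(controls, cases, r=100))
```

Because λ and σ are computed from the pooled expressions, they are unchanged by shuffling the group labels, so only the percentile term q(y) varies across permutations in this sketch.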